Adaptive physics-informed trajectory reconstruction exploiting driver behavior and car dynamics

As more and more trajectory data become available, their analysis creates unprecedented opportunities for traffic flow investigations. However, observed physical quantities like speed or acceleration are often measured having unrealistic values. Furthermore, observation devices have different hardware and software specifications leading to heterogeneity in noise levels and limiting the efficiency of trajectory reconstruction methods. Typical strategies prune, smooth, or locally modify vehicle trajectories to infer physically plausible quantities. The filtering strength is usually heuristic. Once the physical quantities reach plausible values, additional improvement is impossible without ground truth data. This paper proposes an adaptive physics-informed trajectory reconstruction framework that iteratively detects the optimal filtering magnitude, minimizing local acceleration variance under stable conditions and ensuring compatibility with feasible vehicle acceleration dynamics and common driver behavior characteristics. Assessment is performed using both synthetic and real-world data. Results show a significant reduction in the speed error and invariability of the framework to different data acquisition devices. The last contribution enables the objective comparison between drivers with different sensing equipment.


Results
Synthetic and real-world data are used to assess the robustness of the solution. Synthetic noisy data are generated by adding noise to observations obtained using highly accurate differential GPS devices (the AstaZero campaign of the OpenACC dataset 16 ). Data from this campaign are used as a reference, i.e., ground truth. Five noisy synthetic trajectories are generated, assuming the reference data are noise-free. These trajectories, namely N1, N2, N3, N4, N5, have zero mean Gaussian noise and increasing standard deviation from 0.05 to 0.25 [m/s] at frequency of 10 [Hz]. The noise levels reflect the error estimates given by commercial data acquisition systems, such as U-blox. Three state-of-the-art filtering techniques are tested within the framework in the NR component (see Fig. 1), i.e., Moving Average (MA), Lowess Polynomial Regression algorithm (LO), and Butterworth filter (LP), while other similar filtering approaches can also be hosted. We compare the performance of each method when used independently with parameters suggested in the literature versus the same method when employed within the proposed framework, where the filtering strength is adaptively adjusted. When the methods are used independently, fixed parameters are assumed, a time window of 3s for MA, a 7-step window for LO, and a cutoff frequency of 0.75 [Hz] for LP (see 20 for details). When the above techniques are used inside the proposed framework, their notation is pMA, pLO, and pLP, respectively, and their filtering magnitudes (window, steps, or cut-off frequencies) are inferred adaptively. Additional insights that demonstrate the robustness of the proposed methodology in comparison to more sophisticated methodologies for trajectory reconstruction (here used the Fard et al. technique 21 ) are given in the Appendix document. Figure 2a,b illustrate the I RMSE and I MAE indicators for each scenario, that is, the root mean square error and mean absolute error of speed indices, as they are described in the methodology. The incorporation of each one of the three tested methods (MA, LP, LO) within the proposed framework (pMA, pLP, pLO) leads to a significantly improved reconstruction. Higher noise levels correspond to higher errors in comparison to the ground truth. The LO and pLO methods are the best performing and independent of the data frequency, which is a considerable advantage 19 . The LO performs well even with fixed window size, and the efficiency is comparable to the more complex counterparts of pMA and pLO. Within the proposed framework (pLP), the error decreases drastically, and it outperforms all the other methods compared to the other methods. The MA and pMA are fast, overall reliable, and almost independent of the noise levels, while LP and pLP are not recommended for this task.
The proposed framework considers the specifications of the vehicle known. However, this is not always possible in real-world campaigns. Consequently, for large-scale application of the proposed framework, average vehicle dynamics 25 using representative vehicles from Euro Car Segments, as they are defined by European   The application of the proposed adaptive approach improves the efficiency of each method with ordinary filtering strength by up to 80% in terms of error reduction. However, aggregated indicators do not give insights about the resulting microscopic dynamics after the trajectory reconstruction, essential for accurate traffic flow and energy-related conclusions 4 .
Insights on the reconstructed microscopic dynamics follow in Fig. 3. Figure 3a,c,e,g,i illustrate the acceleration over speed values for each synthetic dataset N1, N2, N3, N4, N5 respectively (red dots). The blue dots correspond to the filtered acceleration speed values after the application of the LO method. Figure 3b,d,f,h,j show results with the proposed framework. The red dots correspond to the synthetic datasets, the green dots to the corrections applied by the VDC component, and the black dots are the output of the proposed framework with the pLO method. Figure 3k shows the reference data, while Fig. 3l provides the legend for all sub-figures. The comparison of Fig. 3a,j with Fig. 3k demonstrates some interesting findings regarding the ability of each method to capture the observed acceleration dynamics.
An oscillation of acceleration values for speeds between 0 and 15 [m/s] is prominent in the ground truth data. This oscillation is mostly filtered by the individual LO application, while it is still observable in the results of the proposed physics-informed framework, i.e., pLO. Furthermore, the main body of acceleration observations between speeds 15 [m/s] and 25 [m/s] for the proposed technique is visually close to the ground truth data. The LO method homogenizes this area, resulting in smoother but distorted dynamics compared to ground truth data. The results show the efficiency of the proposed framework, pLO even for high noise levels as in scenarios N3-N5.
The second campaign took place in Greece. A vehicle's trajectory is observed with two smartphones (S1 and S2), differing significantly in price, operating systems, and specifications. S1 costs half the price of S2, has Android, while S2 iOS, and records with an average rate of 1s ( Both signals are re-sampled to 1 [Hz] using linear interpolation. A lack of synchronization is most probably due to the internal clock of each device. Cross-correlation is used to estimate the time lag between the signals 5 . Figure 4 shows the speed and acceleration profiles per measurement device before and after the application of pLO method, which was the most efficient according to the results of the first campaign. The observed profiles are shifted in time, as mentioned above, to become synchronized. Both smartphones record the same vehicle trajectory (being with the driver), and the observed speed profiles are shown in the sub-figure Fig. 4a. Then the pLO is applied to both speed profiles, and the result is illustrated in the sub-figure Fig. 4b. Visually inspection reveals that most of the differences between the two signals have been smoothed, and it is obvious that they refer to the same measured trajectory. Going one order up, the acceleration observations are shown in the subfigure Fig. 4c. Here, the differences between the observations of the two devices are more prominent. After the application of the proposed framework (pLO), the generated acceleration signals become very similar, pointing to the same measurements; see the sub-figure Fig. 4d. This result is very desirable in trajectory reconstruction. It can provide reliable results for several applications involving user comparisons concerning driving behaviors, fuel consumption profiles, and others. Table 2 provides a quantitative evaluation for the second campaign, comparing the mean absolute error (MAE) and the median absolute deviation (MAD) between the two observed signals before and after the application of the proposed methodology. The comparative results refer to speed and acceleration values. In terms of speed, the improvement by the adaptive methodology is around 22.7% and 18.2% for MAE and MAD. The corresponding values for acceleration are even higher, i.e., 69.7% and 72.5% for MAE and MAD.

Discussion
The vehicular and driving behaviors are known in traffic engineering as perhaps the most significant factors for raising complexity and leading to the appearance of non-linear phenomena in road transport systems. Until recently, experimental observations regarding detailed vehicle trajectory data have been scarce. During the last decade, technological advances and cost reductions in sensors enabled the organization of large and complex experimental campaigns. Such datasets, many of which are publicly available, provide invaluable insights into traffic engineering topics, provided they have a low error in observations. www.nature.com/scientificreports/ There is no standardized protocol for designing and executing such experiments. Therefore, each experiment is performed with a different acquisition device, e.g., smartphones, U-blox, OXTS, etc. Such devices have various hardware and software specifications; therefore, the output signals are different even under the same conditions. Noise removal in the measurements is often performed with arbitrarily parametrized filtering approaches, leading to questionable results since no ground-truth reference is available. The main idea is to remove by thresholding obvious outliers, i.e., values with no physical meaning, and then smooth the observation series www.nature.com/scientificreports/ globally or locally (on time or frequency domain) to derive a set of plausible measurements. The above strategy is problematic by design since vehicular and driver dynamics are nonlinear and observations with different devices need custom parametrization that is not obvious. Exploiting the recent modeling advances in longitudinal vehicle dynamics simulation and driver behavior, the current methodology explicitly considers these two dimensions for trajectory reconstruction. The framework design is flexible and model-agnostic. A convergence process initiates, and an iterative process automatically adjusts the filtering strength based on vehicle and driver dynamic constraints.
Results on synthetic data generated based on low-error reference observations show that the proposed framework remarkably removes outliers and noise, maximizing the efficiency of the employed filtering strategy by automatically parametrizing the filtering strength. Furthermore, real-world observations on the same trajectory with two different devices show significant differences between measurements. The proposed framework is applied in both measured trajectories that visually demonstrate remarkable resemblance afterward, which is normal as they refer to the same experiment.
The above results are unique among trajectory reconstruction methodologies. Compliance with realistic nonlinear vehicle and driver dynamics ensures a fair comparison of observed trajectories and behaviors in various topics such as traffic flow, safety, driver identification, etc. Furthermore, the possibility to identify the same experiment from observations with different noise levels facilitates cross-driver comparisons providing reliable insights into topics such as energy consumption, driver aggressiveness, etc.
The main assumption of this framework is that the vehicle specifications, i.e., gearbox, gear ratio, mass, maximum torque, etc., are considered known for each vehicle trajectory 25 . This assumption is quite heavy because this information is not always given, especially in large complex experimental campaigns. However, it is shown that the vehicle dynamics can be efficiently clustered based on some main specifications 35 . The vehicles in the same  www.nature.com/scientificreports/ cluster demonstrate similar dynamics, so no vehicle-specific details are necessary. Alternatively, there are more generic models than MFC, see for example 26 , that use average vehicle specifications and are much more flexible to be used within the proposed framework. It is worth noting that the dynamics of electric powertrains differ significantly from vehicles with Internal Combustion Engines, as the first can offer much higher torque from very low speeds. Therefore a different modeling approach should be used for electric vehicles. An implementation of the MFC for electric powertrains is also publicly available 36 . Finally, an extension or adoption of the proposed framework to capture both lateral and longitudinal dynamics would be interesting as a future work 37 .

Methods
A physics-informed adaptive framework for trajectory reconstruction is proposed in this work. The algorithm employs three main components, as illustrated in Fig. 1: (a) the Vehicle Dynamics Constraint (VDC) process, (b) Driver Dynamics Compliance (DDC) process, and (c) Noise Reduction (NR) process. The next subsections describe these components individually, while the last subsection here presents the experimental campaigns and scenarios used for assessing and validating the approach, as well as the performance indicators. Before discussing the individual components, it is important to elaborate on how vehicle and driver dynamics are modeled and considered in this approach. Figure 5 aims to clarify vehicle and driver dynamics from a modeling perspective in an illustrative way. Using a vehicle's specifications (powertrain, engine power, mass, etc.), MFC microsimulation model 25 , used in this framework, describes the vehicle's continuous acceleration capacity function, namely a p (v) . This function returns the maximum possible acceleration for this specific vehicle at any given speed v. Figure 5 depicts this with the orange line.
Similarly, we compute the continuous minimum comfortable deceleration curve 36 , namely d p (v) (non-safetycritical situations). This function provides the minimum possible acceleration for this specific vehicle and a given speed v. Figure 5 depicts this with the red line.
These two curves inscribe our vehicle power domain. In other words, assuming that the employed vehicle dynamics model is accurate, any real acceleration value for a given speed that falls outside this domain is considered unrealistic (infeasible). Although the results in this work refer to specific vehicle models, it should be noted that the proposed methodology can be expanded for vehicle classes, according to Euro Car Segments, defined by the European Commission 38 , without significant loss of precision.
Furthermore, driver behavior observations reveal that most drivers do not even approach the acceleration capacity of their vehicle for any given speed. Recently, Makridis et al. 34 conducted an unbiased experiment with 20 individuals driving freely (without any given instructions) across Europe the same vehicle over one year. The authors proposed a methodology to characterize the aggressiveness of these drivers and possibly cluster them in groups. One of the conclusions in that work has been that in all observations, maximum observed acceleration never exceeded 50% of the vehicle's acceleration capacity for that given speed. Based on this conclusion, we compute the continuous acceleration capacity function of the driver, namely a d (v) as follows: where v is a given speed and d is a scaling parameter. As mentioned above, in the current work, this parameter is fixed to 0.5, which is the recommended value for common-purpose car-following/driving experiments. In www.nature.com/scientificreports/ particular cases, such as experiments with free-flow accelerations to racing events, d should be tuned to values much closer to 1. This function provides the maximum estimated acceleration for most drivers and a given speed v. Figure 5 depicts this threshold with the green line. Using the above three functions, a p (v) , d p (v) and a d (v) , we partition the acceleration over the speed domain for all possible observations in four main regions (Fig. 5) as follows: • Region A: This set, R A , includes all possible acceleration values for a given speed that the vehicle can not exploit. Any observation inside Region A is considered an outlier and should be box-constrained to a realistic value. • Region B: This set, R B , includes all possible acceleration values for a given speed that the vehicle can exploit but are considered not highly probable for common (ordinary) driving conditions. Any observation inside Region B is considered a noisy measurement and therefore indicates the necessity for noise reduction. • Region C: This set, R C , includes all possible acceleration values for a given speed that the vehicle can exploit and can potentially correspond to ordinary driving conditions. Any observation that lies inside Region C is acceptable as an accurate measurement. Therefore, there is no way to assess it and evaluate if the measurement corresponds to the actual value or not. • Region D: This set, R D , includes all possible acceleration values for a given speed outside a driver's comfortable deceleration values. Any observation inside Region D is considered an outlier and should be box-constrained to a realistic value.
In the context of the current application of reconstructing vehicle trajectories, we denote x(t) as the time series of position measurements for a given vehicle. Moreover, s is the time-step, and f = 1/s is the sampling frequency. Correspondingly, speed and acceleration measurements are obtained with derivation, namely ẋ(t) and ẍ(t) . Additionally, if N A , N B , N C , N D are the observations that fall in the corresponding regions R A , R B , R C and R D , respectively, then, N A + N B + N C + N D = N . Finally, the present study can be employed for electric vehicles that demonstrate different power characteristics by utilizing the corresponding version of the MFC model 36 .
Vehicle dynamics constraint. The goal of VDC is to ensure that all the observations in regions R A and R B are box-constrained to physically possible values, i.e., orange and red lines at the boundaries of regions R B and R C in Fig. 5. An advantage of the proposed framework over existing works is that it constrains the outliers nonlinearly. It is a common phenomenon in experiments to produce outliers, values that do not make any sense from the point of view of physics. Such values may appear due to different factors, such as weather conditions, road geometry, sensor errors, malfunctions, and others. Hard thresholding is commonly used in the literature toward this scope, i.e., removing accelerations above 5 [m/s 2 ] and below −8 [m/s 2 ] . However, the application of a horizontal threshold is not efficient. For instance, an acceleration of 3 [m/s 2 ] can be achieved by a vehicle under low speeds, but it is unrealistic for high speeds, i.e., over 100 [km/h]. Reliable data acquisition systems might capture only a few such observations, i.e., that lie in R A or R D , as they could also have their built-in filtering techniques. However, when data are collected with low-accuracy devices (e.g., mobile phones), this process plays an essential role.
The resulted observations set that is constrained by vehicle dynamics is called ẍ vdc and can be described as follows: This process is applied iteratively inside the proposed methodology for increasing filtering strength, generating a new set of values at each iteration. Therefore, the paper uses an indicator l to describe the loop count wherever necessary.
Driver dynamics compliance. The goal of DDC is to ensure that all observations respect the average driver behavior and automatically assess the filtering magnitude that will be applied by the noise reduction technique. Ultimately, the framework should work independently of the noise reduction methodology and its parameters related to the filtering strength. One of the main problems in traditional trajectory reconstruction techniques is the definition of granularity in the noise reduction processes. For example, in low-pass filtering, the cut-off frequency can vary depending on the noise levels in the raw data and/or sampling frequency. Similarly, in polynomial regression or moving average, the span of the window that will be used for smoothing can heavily distort the observed profile. The proposed compliance check focuses on ensuring the following requirements: • (RQ1) All acceleration values correspond to common driver aggressiveness levels.
• (RQ2) The variance of sequential accelerations locally in time is close to the variance typically observed in high-accuracy data acquisition systems (e.g., Differential GPS).
The first requirement, RQ1, aims at ensuring that measurements correspond to typical drivers. Typical drivers do not exploit the full power and, thus, acceleration capacity of their vehicle 34 . The term aggressiveness is used within this work to characterize drivers that accelerate sharper than others, i.e., approaching the vehicle's maximum acceleration for a given speed. According to Fig. 5, all observations should lie in the area below the www.nature.com/scientificreports/ orange line, i.e., the maximum possible acceleration of the vehicle at a given speed. In practice, most observations fall around the middle of the area inscribed by the orange and red lines (comfortable deceleration as a function of speed). Observations near the orange line or below the red line are rarely observed, indicating excessive aggressiveness in driving. In our opinion, this is something to be considered in trajectory analysis. Of course, it should be noted that for particular types of experiments (e.g., maximum acceleration from 0 to 100 km/h), this requirement should be relaxed, i.e., the green line in Fig. 5 should be defined closer to the orange one ( R B area becomes smaller). This process validates the ratio r l = N C l /N , which is the number of observations that fall inside R C , for iteration l, over the total number of observations N. The ratio r l is parametrized for every iteration l because the number of values that fall in R C can differ in every iteration. According to this criterion, an acceptable dataset should contain only a few values outside region R C . Therefore, this ratio needs to be lower than a fixed threshold, namely d th (here set to 0.05, or 5% of the observations). The second requirement, RQ2, aims to monitor the acceleration signal's local variations. Under real-world non-critical driving conditions, the local variance of accelerations should be relatively low. In the frequency domain, the above irregular pattern is often encountered in empirical observations due to noise, and therefore, low-pass filtered techniques are commonly used to mitigate this effect. Consequently, even if all observations are compatible with vehicle and driver dynamics, the magnitude of local acceleration variations should be low before the application of a smoothing technique and therefore plays a decisive role in the parametrization of the filter's strength (i.e., window or cut-off frequency).
On the other hand, the amount of noise in a signal directly impacts the observed variations. Furthermore, each technique has different efficiency in alleviating these variations, which is not known beforehand. The proposed work estimates the parameters that impact the filter strength based on the above. Assuming that we have N acceleration observations ẍ(t) , their local standard deviation σẍ ,l (k) , for iteration l around k, and a fixed window w var (set to 1.5 s) is computed as follows: where Since the acceleration values change, the local standard deviation is parametrized per iteration, l. Furthermore, we consider as an indicator the median value of all observed local standard deviations, med({σẍ ,l (k)}, ∀k) , namely med(σẍ ,l ) . If we consider that most of the time, the driver (either human or automated) aims at maintaining a constant speed, we can assume that med(σẍ ,l ) will correspond to a prevalent value. However, standard noise directly impacts individual med(σẍ ,l )(k) values and consequently the global med(σẍ ,l ) indicator. Furthermore, we define the following function: The idea behind function f (w l ) is to automatically determine the filtering magnitude for the noise reduction component based on the reduction of the acceleration's local variability. Specifically, increasing filtering strength at each iteration l, i.e., increasing the window/step size or lowering the cut-off frequency, σ is expected to decrease respectively. At the same time, the rate of med(σẍ ,l ) , which in our discrete system is the normalized difference between med(σẍ ,l−1 ) and med(σẍ ,l ) , decreases as well, towards an optimal result corresponding to the applied technique. However, after passing the optimal threshold, we expect the function f to stop being monotonically decreasing and most likely fluctuate as the filter strength increases further (larger windows or lower cut-off frequencies).
An example of f function trajectory over iterations is shown in Fig. 6a. The trajectory's three markers (red, yellow, and blue) correspond to the optimal, typical, and over-filtered window sizes. The idea is that until the red marker, we have high certainty that the noise decreases as the local standard deviation rate decreases. After that iteration, there is little knowledge about real noise reduction. Typical thresholds, depending on the dataset, might lead to a reasonable noise reduction level as was shown in Fig. 2a,b, but the final result depends on the dataset specifications (noise, frequency, etc.). The proposed technique ensures a satisfactory result and automatic inference of the filter's magnitude. Figure 6b illustrates the acceleration over speed diagrams for the typical, optimal, and over-filtered points, showcasing the robustness of the proposed process.
Concretely, we set the filtering range threshold to the maximum filter range that can ensure that function f remains monotonically decreasing, that is, the size of the window or cut-off frequency w l for which f (w l ) > f (w l−1 ) . To this end, we have: (5) f (w l ) = med(σẍ ,l−1 ) − med(σẍ ,l ) med(σẍ ,l ) . Performance assessment. The assessment of trajectory reconstruction methodologies is challenging when there is no ground truth data. In this paper, we employ two experimental campaigns with different campaigns to perform our assessment. Campaign 1: The AstaZero experiments described in OpenACC dataset 16 include low-noise position observations from multiple vehicles inside the AstaZero test track in Sweden. The measurements have low noise due to the differential GPS data acquisition equipment. This work uses around 25km of a vehicle trajectory in this test track. Because of the initial low noise levels, we consider this reference trajectory as ground truth. Gaussian noises of zero means and increasing standard deviation values are added to create noisy synthetic speed profiles. More specifically, 5 levels of standard deviation in [m/s] are considered in the results, {0.05, 0.1, 0.15, 0.2, 0.25} . The proposed framework is tested for the above three noise reduction methods and five noise levels. Moreover, comparisons are presented for the three noise reduction methods with parameters proposed in the literature.
For comparison indicators, the mean absolute error and root mean squared error on the speed profiles are used; |ẋ −x| Figure 6. Example of f function over the number of iterations for the LO method with. Table 3. High-level algorithmic workflow.

Initialization
Step 1: Load raw data Step 2: Initialize filter strengths w MA , w LO or f LP Step 3: Apply VDC Step 4: Check DDC: If PASS GoTo Step 9; Else GoTo Step 5 Step 5: Set w MA = w MA + 1 , w LO = w LO + 1 or f LP = cf LP − 0.05 Step 6: Apply Noise Reduction (MA, LO or LP) Step 7: Apply VDC Step 8: Check DDC: If PASS or filter strength is maximum GoTo Step 9; Else GoTo Step 5 Step 9: END www.nature.com/scientificreports/ where N is the number of observations, ẋ the ground-truth and x the reconstructed trajectory. Campaign 2: This campaign includes the observed trajectories recorded by the sensors of two smartphones via the Phyphox application 39 . Two different smartphone devices, one less and another more expensive, with Android and iOS operating systems, respectively, have been used during the same experiment. The idea is to assess the consistency of the proposed framework for different sensors and error levels on the same trajectory. We compute the mean absolute error and median absolute deviation between the two signals to provide quantitative assessment values.